99 research outputs found
Social network support for data delivery infrastructures
Network infrastructures often need to stage content so that it is accessible to consumers. The standard solution, deploying the content on a centralised server, can be inadequate in several situations.
Our thesis is that information encoded in social networks can be used to tailor content staging decisions to the user base and thereby build better data delivery infrastructures. This claim is supported by two case studies, which apply social information in challenging situations where traditional content staging is infeasible. Our approach works by examining empirical traces to identify relevant social properties, and then exploits them.
The first study looks at cost-effectively serving the "Long Tail" of rich-media user-generated content, which needs to be staged close to viewers to control latency and jitter. Our traces show that a preference for the unpopular tail items often spreads virally and is localised to some part of the social network. Exploiting this, we propose Buzztraq, which decreases replication costs by selectively copying items to locations favoured by viral spread. We also design SpinThrift, which separates popular and unpopular content based on the relative proportion of viral accesses, and opportunistically spins down disks containing unpopular content, thereby saving energy.
The second study examines whether human face-to-face contacts can efficiently create paths over time between arbitrary users. Here, content is staged by spreading it through intermediate users until the destination is reached. Flooding every node minimises delivery times but is not scalable. We show that the human contact network is resilient to individual path failures, and for unicast paths, can efficiently approximate flooding in delivery time distribution simply by randomly sampling a handful of paths found by it. Multicast by contained flooding within a community is also efficient. However, connectivity relies on rare contacts and frequent contacts are often not useful for data delivery.
Also, periods of similar duration could achieve different levels of connectivity; we devise a test to identify good periods. We finish by discussing how these properties influence routing algorithms.
This work was supported by a St. John's College Benefactor's Scholarship and a Research Studentship from the Cambridge Philosophical Society.
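As an illustration of why flooding gives the minimum delivery time over a contact trace, the following minimal Python sketch computes the earliest time a message from one source can reach every other user. It is not the thesis's implementation: the function name `flood_delivery_times`, the `(time, user_a, user_b)` contact format, and the toy trace are all assumptions made for this example.

```python
def flood_delivery_times(contacts, source, start=0):
    """Earliest arrival time at each reachable user when every contact forwards.

    contacts: iterable of (time, user_a, user_b) tuples (hypothetical format).
    Contacts are treated as symmetric. Contacts sharing the same timestamp are
    processed in sorted order, so same-instant relaying depends on that order.
    """
    arrival = {source: start}
    for t, u, v in sorted(contacts):
        if t < start:
            continue  # contact happened before the message existed
        for holder, other in ((u, v), (v, u)):
            # Contacts are visited in time order, so the first time a user
            # receives the message is also the earliest possible time.
            if holder in arrival and other not in arrival:
                arrival[other] = t
    return arrival


# Toy trace (made up): a meets b, b meets c, a meets c, c meets d.
trace = [(1, "a", "b"), (2, "b", "c"), (3, "a", "c"), (5, "c", "d")]
print(flood_delivery_times(trace, "a"))
```

Users that never appear in the returned dictionary are unreachable from the source within the trace, which is one simple way to see that connectivity depends on which contacts occur and when.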
On Factors Affecting the Usage and Adoption of a Nation-wide TV Streaming Service
Using nine months of access logs comprising 1.9 billion sessions to BBC
iPlayer, we survey the UK ISP ecosystem to understand the factors affecting
adoption and usage of a high bandwidth TV streaming application across
different providers. We find evidence that connection speeds are important and
that external events can have a huge impact on live TV usage. Then, through a
temporal analysis of the access logs, we demonstrate that data usage caps
imposed by mobile ISPs significantly affect usage patterns, and look for
solutions. We show that product bundle discounts with a related fixed-line ISP,
a strategy already employed by some mobile providers, can better support user
needs and capture a bigger share of accesses. We observe that users regularly
split their sessions between mobile and fixed-line connections, suggesting a
straightforward strategy for offloading by speculatively pre-fetching content
from a fixed-line ISP before access on mobile devices.
Comment: In Proceedings of IEEE INFOCOM 201
Illuminating an Ecosystem of Partisan Websites
This paper aims to shed light on alternative news media ecosystems that are
believed to have influenced opinions and beliefs by false and/or biased news
reporting during the 2016 US Presidential Elections. We examine a large,
professionally curated list of 668 hyper-partisan websites and their
corresponding Facebook pages, and identify key characteristics that mediate the
traffic flow within this ecosystem. We uncover a pattern of new websites being
established in the run-up to the elections and abandoned afterwards. Such websites
form an ecosystem by creating links from one website to another and by 'liking'
each other's Facebook pages. These practices are highly effective in directing
user traffic internally within the ecosystem in a highly partisan manner, with
right-leaning sites linking to and liking other right-leaning sites and
similarly left-leaning sites linking to other sites on the left, thus forming a
filter bubble amongst news producers similar to the filter bubble which has
been widely observed among consumers of partisan news. Although there is
activity among both left- and right-leaning sites, right-leaning sites are more
evolved, accounting for a disproportionate number of abandoned websites and
partisan internal links. We also examine demographic characteristics of
consumers of hyper-partisan news and find that some of the more populous
demographic groups in the US tend to be consumers of more right-leaning sites.
Comment: Published at The Web Conference 2018 (WWW 2018). Please cite the WWW
versio
ISP-friendly Peer-assisted On-demand Streaming of Long Duration Content in BBC iPlayer
In search of scalable solutions, CDNs are exploring P2P support. However, the
benefits of peer assistance can be limited by various obstacle factors such as
ISP friendliness - requiring peers to be within the same ISP, bitrate
stratification - the need to match peers with others needing similar bitrate,
and partial participation - some peers choosing not to redistribute content.
This work relates potential gains from peer assistance to the average number
of users in a swarm (its capacity), and empirically studies the effects of these
obstacle factors at scale, using a month-long trace of over 2 million users in
London accessing BBC shows online. Results indicate that even when P2P swarms
are localised within ISPs, up to 88% of traffic can be saved. Surprisingly,
bitrate stratification results in 2 large sub-swarms and does not significantly
affect savings. However, partial participation and the need for a minimum
swarm size do affect gains. We investigate improvements to gain from increasing
content availability through two well-studied techniques: content bundling -
combining multiple items to increase availability, and historical caching of
previously watched items. Bundling proves ineffective as increased server
traffic from larger bundles outweighs benefits of availability, but simple
caching can considerably boost traffic gains from peer assistance.
Comment: In Proceedings of IEEE INFOCOM 201
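To make the obstacle factors concrete, here is a minimal Python sketch of an idealised savings estimate. It is not the paper's model: it assumes all sessions in the input are concurrent, that every peer participates, and that each swarm needs exactly one stream from the server while peers supply the rest. The function name `peer_savings`, the `(isp, item, bitrate)` session format, and the sample sessions are hypothetical.

```python
from collections import Counter


def peer_savings(sessions, key, min_swarm=2):
    """Fraction of sessions that peers could serve under the stated assumptions.

    sessions:  list of (isp, item, bitrate) tuples (hypothetical format).
    key:       function mapping a session to its swarm identifier.
    min_swarm: swarms smaller than this are served entirely by the server.
    """
    if not sessions:
        return 0.0
    swarm_sizes = Counter(key(s) for s in sessions)
    # In each viable swarm one stream comes from the server, the rest from peers.
    offloaded = sum(n - 1 for n in swarm_sizes.values() if n >= min_swarm)
    return offloaded / len(sessions)


# Made-up sessions: three users on ISP "A", one on ISP "B", same item.
sessions = [("A", "x", 1500), ("A", "x", 1500), ("A", "x", 800), ("B", "x", 1500)]

print(peer_savings(sessions, key=lambda s: s[1]))          # swarms per item only
print(peer_savings(sessions, key=lambda s: (s[0], s[1])))  # ISP-friendly swarms
print(peer_savings(sessions, key=lambda s: s))             # ISP and bitrate
```

Changing the `key` function restricts who may share with whom, and raising `min_swarm` models the need for a minimum swarm size; both shrink the estimate in this toy setting.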
Wearing Many (Social) Hats: How Different are Your Different Social Network Personae?
This paper investigates whether, when users create profiles in different social
networks, those profiles are redundant expressions of the same persona or
are adapted to each platform. Using the personal webpages of 116,998 users on
About.me, we identify and extract matched user profiles on several major social
networks including Facebook, Twitter, LinkedIn, and Instagram. We find evidence
for distinct site-specific norms, such as differences in the language used in
the text of the profile self-description, and the kind of picture used as
profile image. By learning a model that robustly identifies the platform given
a user's profile image (0.657--0.829 AUC) or self-description (0.608--0.847
AUC), we confirm that users do adapt their behaviour to individual platforms in
an identifiable and learnable manner. However, different genders and age groups
adapt their behaviour differently from each other, and these differences are,
in general, consistent across different platforms. We show that differences in
social profile construction correspond to differences in how formal or informal
the platform is.
Comment: Accepted at the 11th International AAAI Conference on Web and Social
Media (ICWSM17
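For readers unfamiliar with the AUC figures quoted above, the following minimal Python sketch computes AUC for one platform against the rest as the probability that a randomly chosen positive example is scored above a randomly chosen negative one, with ties counted as half. The function name `auc` and the scores are made up for illustration; this is not the paper's model or data.

```python
def auc(pos_scores, neg_scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical classifier scores for "is this profile from platform X?".
from_platform = [0.9, 0.4]   # profiles that really are from platform X
from_others = [0.5, 0.1]     # profiles from other platforms
print(auc(from_platform, from_others))
```

A value of 0.5 corresponds to random guessing and 1.0 to perfect separation.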
GASCOM: Graph-based Attentive Semantic Context Modeling for Online Conversation Understanding
Online conversation understanding is an important yet challenging NLP problem
which has many useful applications (e.g., hate speech detection). However,
online conversations typically unfold over a series of posts and replies to
those posts, forming a tree structure within which individual posts may refer
to semantic context from higher up the tree. Such semantic cross-referencing
makes it difficult to understand a single post by itself; yet considering the
entire conversation tree is not only difficult to scale but can also be
misleading as a single conversation may have several distinct threads or
points, not all of which are relevant to the post being considered. In this
paper, we propose a Graph-based Attentive Semantic COntext Modeling (GASCOM)
framework for online conversation understanding. Specifically, we design two
novel algorithms that utilise both the graph structure of the online
conversation as well as the semantic information from individual posts for
retrieving relevant context nodes from the whole conversation. We further
design a token-level multi-head graph attention mechanism that attends
differently to different tokens from different selected context utterances for
fine-grained conversation context modeling. Using this semantic conversational
context, we re-examine two well-studied problems: polarity prediction and hate
speech detection. Our proposed framework significantly outperforms
state-of-the-art methods on both tasks, improving macro-F1 scores by 4.5% for
polarity prediction and by 5% for hate speech detection. The GASCOM context
weights also enhance interpretability
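The macro-F1 metric mentioned above averages the per-class F1 scores so that every class counts equally, regardless of how many examples it has. A minimal Python sketch follows; the function name `macro_f1` and the toy labels are assumptions for illustration and are unrelated to the paper's datasets.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example: 1 = hateful, 0 = not hateful (made-up labels).
print(macro_f1([1, 1, 0, 0], [1, 0, 0, 0]))
```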